Let’s Go To The Movies

Davin Dillon

2022-05-03

Paying for Oscars or Oscars for Pay?

Data Wrangling

Wrangling involved many column renames and, in some cases, choosing between two reported budgets for the same movie. Once I had the columns I wanted, I set about joining the data. First, I joined the metadata with the budget data to include every title for which I had information. My next adventure (or misadventure) was to join this data with the inflation data. Once that was done, all that was left was joining the result with the Oscar data. All in all, I used three full joins.

oscars <- oscars %>%    # rename year for join
  rename('year' = 'year_film')

meta$year <-  format(as.Date(meta$release_date, format = "%m/%d/%Y"), "%Y")

blue = '#000080' # just the color code I wanted to use for my print
  
meta <- meta %>% 
  mutate(year = as.numeric(year)) # easier use for comparisons/inflation

# renames
budget <- budget %>% 
  rename(vote_average = score) %>% # renaming vote data for joins etc
  rename(vote_count = votes) %>% 
  rename(Title = 'Movie Title')


oscars <- oscars %>% 
  rename(Title = 'film')  # rename for joins

adj <- inflation %>% 
  mutate(multiplier = (22.82/amount)) 
# create multiplier column for easy calculations

budget <- budget %>% 
  rename(new_budget = Budget)  # rename for joins etc


options(scipen = 100) # avoid scientific notation 

full_budget <- full_join(meta, budget,
                         by = c("runtime", "Title", "vote_average",
                                "vote_count", "year"))
# full join of metadata and budget data to get new and old movies etc
# keys are every column the two tables share (full_join() takes `by`, not `on`)

full_bud <- full_budget %>% 
  mutate(budget = pmax(new_budget, meta_budget, na.rm = TRUE)) %>% 
  select(Title,genres, budget,new_budget,
         meta_budget, popularity,year,
         release_date, revenue, runtime, vote_average,
         vote_count,gross)
# set budget to max of two different budgets.
# picking max is arbitrary, but needed in most cases


full_bud <- full_bud[-c(1,2,3),] %>%
   arrange(desc(as.numeric(budget)))
# remove first three unnecessary rows


full_bud <- full_bud %>%  # replace 1900 sentinels with 2022
  mutate(year = replace(year, year == 1900, 2022))
# Most of these seemed to be less known movies
# so I thought that 2022 would do the least harm 
# with the inflation numbers

adj_bud <- full_join(full_bud, adj, by = "year") %>%  # joining movies with inflation
  mutate(with_inflation = as.numeric(budget) * multiplier) %>% 
  mutate(gross_inflation = as.numeric(gross) * multiplier) %>% 
  select(Title, genres, vote_average, vote_count, budget,
         with_inflation, gross_inflation, year, gross,
         release_date) %>% 
  arrange(desc(with_inflation))
# joined on year, the only column the two tables share
# creating with_inflation(budget) and gross_inflation columns
# gross is read from the joined data so each row lines up with its multiplier
# arranging by highest with_inflation budgets


osc_bud <- full_join(adj_bud, oscars, by = c("Title", "year"))
# joining all other data to Oscars data
# This is the starting point for most of my data manipulation
# 62706 entries, 15 total columns, of which 8 columns used.
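
To make the behavior of these joins easier to see, here is a minimal sketch on made-up toy tables (the `movies` and `awards` tables and all their values are hypothetical, not taken from the real data). It names the key columns explicitly through `by`, which is the argument `full_join()` uses for join keys, and shows that a full join keeps unmatched rows from both sides and fills the gaps with `NA`.

```r
library(dplyr)

# Toy stand-ins for the real tables; every value here is invented.
movies <- tibble(
  Title  = c("Film A", "Film B"),
  year   = c(1997, 2010),
  budget = c(200, 50)
)
awards <- tibble(
  Title  = c("Film A", "Film C"),
  year   = c(1997, 1955),
  winner = c(TRUE, FALSE)
)

# Naming the keys avoids relying on dplyr's default of joining
# on every column name the two tables happen to share.
joined <- full_join(movies, awards, by = c("Title", "year"))

# Film B has no award row and Film C has no budget row,
# so each of them appears once with an NA in the missing column.
print(joined)
```

Because unmatched rows are kept, a chain of full joins can only grow the table, which is worth keeping in mind when reading the final row count.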

Some Dataset Numbers

The data I was able to collect contains 4,834 movies nominated for an Academy Award since the awards' inception in 1929. Of these 4,834 movies, 1,274 won at least one award. The dataset contains 13,312 total Oscar nominations and 3,001 total Oscar wins. 559 movies have been nominated for Best Picture in its many forms; of these, 92 won. 1,154 movies have had an actor or actress nominated in either a leading or supporting role, and 313 movies had at least one winner in an acting category.
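
Counts like these could be produced along the following lines. This is only a sketch on an invented table with one row per nomination; the `noms` table, its values, and the `category` and `winner` column names are assumptions for illustration, not the code behind the figures above.

```r
library(dplyr)

# Hypothetical miniature of an Oscar table: one row per nomination.
noms <- tibble(
  Title    = c("Film A", "Film A", "Film B", "Film C", "Film C"),
  category = c("BEST PICTURE", "ACTOR", "BEST PICTURE", "ACTRESS", "DIRECTING"),
  winner   = c(TRUE, FALSE, FALSE, TRUE, TRUE)
)

counts <- noms %>%
  summarise(
    nominated_movies  = n_distinct(Title),          # films with any nomination
    winning_movies    = n_distinct(Title[winner]),  # films with at least one win
    total_nominations = n(),                        # one row per nomination
    total_wins        = sum(winner)                 # TRUE counts as 1
  )

print(counts)
```

On real data, two different films can share a title, so counting distinct title and year pairs may be safer than counting titles alone.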

Plotting Profit

Highest Percent Profit

Lowest Percent Profit

Who spends more money?

  • The average Oscar winner budget was $82,927,713.63.
  • The median Oscar winner budget was $52,732,524.55.
  • The average Oscar loser budget was $73,041,808.78.
  • The median Oscar loser budget was $49,789,090.91.
  • The average non-nominated budget was $50,077,035.39.
  • The median non-nominated budget was $34,659,781.29.
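
A grouped mean and median of this kind can be sketched as follows. The `films` table, its `status` labels, and the budget values are invented for illustration, and whether the figures above were computed on raw or inflation-adjusted budgets is not stated, so the column is simply called `budget` here.

```r
library(dplyr)

# Invented data: three films per group, budgets in arbitrary units.
films <- tibble(
  status = rep(c("winner", "loser", "non-nominated"), each = 3),
  budget = c(100, 40, 10,   # winners
             80, 20, 20,    # losers
             30, 10, 5)     # non-nominated
)

budget_summary <- films %>%
  group_by(status) %>%
  summarise(
    mean_budget   = mean(budget, na.rm = TRUE),
    median_budget = median(budget, na.rm = TRUE),
    .groups = "drop"
  )

print(budget_summary)
```

Reporting both statistics is useful because a few very expensive films pull the mean well above the median, as the gap between the two in each group suggests.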

Who makes more money?

  • The average Oscar winner percent profit was 634.94.
  • The median percent profit for Oscar winners was 575.55.
  • The average Oscar loser percent profit was 444.90.
  • The median percent profit for Oscar losers was 377.84.
  • The average non-nominated percent profit was 248.09.
  • The median percent profit for non-nominated movies was 161.21.
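
The formula for percent profit is not spelled out here. One common definition, assumed in the sketch below, is profit as a share of budget: (gross - budget) / budget * 100. The `percent_profit` helper is hypothetical and the inputs are made up.

```r
# Assumed definition, not confirmed by the source:
# percent profit = (gross - budget) / budget * 100
percent_profit <- function(gross, budget) {
  (gross - budget) / budget * 100
}

# A film that cost 100 and grossed 300 tripled its money: 200 percent profit.
single <- percent_profit(gross = 300, budget = 100)

# The function is vectorized, so it also works on whole columns.
several <- percent_profit(gross = c(150, 50), budget = c(100, 100))

print(single)
print(several)
```

Under this definition a film that exactly earns back its budget scores 0, and a film that loses money scores below 0.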

Spent vs Made

Best Picture Budgets with Inflation

Adjusted for inflation, the cheapest Best Picture winner was Marty, with a budget of $3,674,769.95. The average inflation-adjusted budget for a Best Picture winner was $51,335,637.82. The most expensive was Titanic, with an inflation-adjusted budget of $358,241,758.24.
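
The inflation adjustment behind these figures follows the `22.82 / amount` multiplier from the wrangling code. The sketch below replays that step on invented numbers: the index values and film budgets are hypothetical, and treating 22.82 as the 2022 value of the index is an assumption.

```r
library(dplyr)

# Hypothetical index values per year; 22.82 is assumed to be the 2022 value.
inflation <- tibble(
  year   = c(1955, 1997, 2022),
  amount = c(2.00, 11.41, 22.82)
)

# Same multiplier as in the wrangling code: reference value / value in that year.
adj <- inflation %>%
  mutate(multiplier = 22.82 / amount)

# Two invented films with the same nominal budget in different years.
films <- tibble(
  Title  = c("Old Film", "Recent Film"),
  year   = c(1955, 2022),
  budget = c(1e6, 1e6)
)

adjusted <- films %>%
  left_join(adj, by = "year") %>%
  mutate(with_inflation = budget * multiplier)

print(adjusted)
```

With these made-up index values, the 1955 budget is scaled up by 22.82 / 2.00 while the 2022 budget is left unchanged, which is why older films can rank much higher once inflation is taken into account.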